An Overview of Different Binary Methods for Documents Based on Their Features
Abstract
This paper surveys binarization methods for document images. The main role of binarization is to reduce dimensionality and noise, and it is one of the most important preprocessing steps in document image understanding and compression. Binarizing an image means classifying its pixels into two classes, background and foreground. The input to this classification is a feature vector built from the intensity values of the image pixels. New features are extracted from this initial vector, and a cost function that acts as the classifier is constructed from them. The intensity value that maximizes the cost function is taken as the boundary (the threshold) between the two classes. The paper divides binarization algorithms into three groups. The first uses a single input feature vector containing the intensity value of each pixel. The second builds an input feature vector for each pixel from the intensity of that pixel and of its neighbors. The third combines schemes from the first and second groups.
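The first two groups can be illustrated with a minimal sketch. The abstract does not name specific algorithms, so the choices here are assumptions made for illustration: Otsu's method stands in for the first group (its cost function is the between-class variance, maximized over candidate intensity values), and a simple local-mean rule stands in for the second group. The function names, the `radius` and `offset` parameters, and the convention that 1 marks the brighter class are all hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image, intensities in 0..255


def otsu_threshold(image: Image) -> int:
    """Return the intensity t that maximizes the between-class variance."""
    hist = [0] * 256
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_cost = 0, -1.0
    w0 = 0    # number of pixels with intensity <= t
    sum0 = 0  # sum of the intensities of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the cost is undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        cost = w0 * w1 * (mu0 - mu1) ** 2
        if cost > best_cost:
            best_t, best_cost = t, cost
    return best_t


def binarize_global(image: Image) -> Image:
    """Group 1 (assumed example): one threshold for the whole image."""
    t = otsu_threshold(image)
    return [[1 if v > t else 0 for v in row] for row in image]


def binarize_local(image: Image, radius: int = 1, offset: float = 0.0) -> Image:
    """Group 2 (assumed example): compare each pixel with its neighborhood mean."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            mean = sum(window) / len(window)
            row.append(1 if image[y][x] > mean - offset else 0)
        out.append(row)
    return out


sample = [
    [10, 12, 200],
    [11, 210, 205],
    [9, 10, 198],
]
global_result = binarize_global(sample)
local_result = binarize_local(sample, radius=1)
```

A scheme from the third group would combine both kinds of information, for example by using a global threshold where the local window is nearly uniform and a local one elsewhere; that particular combination is only one possibility and is not taken from the paper.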
Similar resources
RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
The principal aim of a search engine is to provide sorted results according to the user's requirements. To achieve this aim, it employs ranking methods to rank web documents based on their significance and relevance to the user query. The novelty of this paper is a user feedback-based ranking algorithm that uses reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...
An Improvement in Support Vector Machines Algorithm with Imperialism Competitive Algorithm for Text Documents Classification
Due to the exponential growth of electronic texts, their organization and management require a tool that provides users with the information and data they search for in the shortest possible time. Thus, classification methods have become very important in recent years. In natural language processing and especially text processing, one of the most basic tasks is automatic text classification. Moreover, text ...
An Overview of Nonlinear Spectral Unmixing Methods in the Processing of Hyperspectral Data
Hyperspectral imagery provides images in hundreds of spectral bands within different wavelength regions. This technology has been increasingly applied in different fields of earth sciences, such as mineral exploration, environmental monitoring, agriculture, urban science, and planetary remote sensing. However, despite the ability of these data to detect surface features, the measured spectrum i...
Facial expression recognition based on Local Binary Patterns
Classical LBP has drawbacks such as complexity and high-dimensional feature vectors, which make it necessary to apply dimension reduction processes. In this paper, we introduce an improved LBP algorithm to solve these problems that uses the Fast PCA algorithm to reduce the dimensions of the extracted feature vectors. In other words, the proposed method (Fast PCA+LBP) is an improved LBP algorithm that is extracted ...
Translation Evaluation in Educational Settings for Training Purposes
The following article describes different methods and techniques used in educational settings for translation evaluation. Translation evaluation is the placing of value on a translation i.e. awarding a mark, even if only a binary pass/fail one. In the present study, different features of the texts chosen for evaluation were firstly considered and then scoring the t...
Feature extraction in opinion mining through Persian reviews
Opinion mining deals with an analysis of user reviews for extracting their opinions, sentiments and demands in a specific area, which can play an important role in making major decisions in such area. In general, opinion mining extracts user reviews at three levels of document, sentence and feature. Opinion mining at the feature level is taken into consideration more than the other two levels d...